1 Mount Hood Environmental, PO Box 1303, Challis, Idaho, 83226, USA
2 Mount Hood Environmental, 39085 Pioneer Boulevard #100 Mezzanine, Sandy, Oregon, 97055, USA
3 Mount Hood Environmental, PO Box 4282, McCall, Idaho, 83638, USA

Correspondence: Bryce N. Oldemeyer, Mark Roes

1 Background

Habitat covariates recently used in six quantile random forest (QRF) capacity models (Chinook salmon and steelhead; summer parr, winter presmolts, and redds) were chosen for their high predictive power when estimating capacity across the Columbia River Basin (cite IRA?). However, a subset of the covariates included in the QRF models was not necessarily useful for restoration project monitoring or for describing target conditions for restoration design, because those covariates are not easily manipulated by project actions. Additionally, some of the CHaMP covariates used in previous models are difficult to replicate or measure using streamlined fish habitat protocols (DASH; Carmichael et al. 2019). To increase the utility of the QRF model for project monitoring and design and for future data collection, we explored alternative covariates that 1) maintained high predictive power, 2) were informative for restoration efforts and monitoring, 3) could be calculated from DASH surveys, 4) were not missing an overabundance of data, and 5) were not highly correlated with other covariates in the models. Using these criteria, we developed six modified QRF models that were more informative for restoration design and monitoring, included covariates that could be calculated using newly developed stream habitat protocols, and maintained a level of predictive power similar to that of the original QRF habitat capacity models.

Similarly, a random forest extrapolation model was used to predict capacity across larger scales where CHaMP or DASH data are absent (cite IRA). We revisited the globally available attributes (GAAs) included in the original random forest extrapolation model and made minor modifications so that the GAAs in the extrapolation model better aligned with the modified QRF model covariates.

Below is a brief outline documenting these efforts, as well as a comparison of extrapolation estimates for the original and modified QRF models for eight watersheds in the Upper Salmon River region.

2 Covariate selection

The QRF model was fit using a revised covariate selection process that placed more emphasis on compatibility with future data collection via DASH and on the ability to predict restoration effects. Habitat data collected by CHaMP and other sources (e.g., NorWeST stream temperature) were subset to a list of potential covariates that could be reproduced using DASH data collection. This provides an opportunity to collect new paired fish and habitat data using DASH protocols, reducing the reliance of the QRF model on CHaMP habitat data.

Below is the rubric used to inform the covariate selection process for each of the six models (Chinook salmon and steelhead; summer parr, winter presmolts, and redds):

  1. Strength of association between the covariate and the response variable (based on the maximal information coefficient, or MIC, score)

  2. Informative for restoration efforts (Yes/No)

  3. Could be calculated using DASH data (Yes/No)

  4. How much data were missing, and how many values were “0”?

  5. How correlated was the covariate with other covariates in the original QRF model, and covariates within the same model?

A simplified example of the covariate selection process might unfold as follows. In the original QRF model, discharge might have been included because it had a high MIC score and made biological sense. Unfortunately, discharge is not very informative for restoration efforts because most restoration actions cannot create water. Discharge, like many habitat covariates, is highly correlated with other habitat covariates, but those covariates may have been left out of the original QRF model for any number of reasons (e.g., they were highly correlated with, or redundant to, covariates already in the model). Using the rubric, we found that average thalweg depth had a MIC score nearly as high as that of discharge, was informative for restoration, could be calculated with DASH, and was highly correlated with discharge (the high correlation is likely why average thalweg depth was left out of the original QRF model). Based on this information, we would substitute average thalweg depth for discharge in that particular model. This process was repeated for all other QRF covariates in each of the six models.
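The rubric can be pictured as a simple screening pass over candidate covariates. The sketch below is only an illustration and not the procedure actually used, which relied on judgment across all five rubric items. The thresholds (`max_missing`, `max_zero`, `max_abs_r`), the greedy highest-MIC-first ordering, the covariate names, and every number are hypothetical, and MIC scores are supplied as precomputed inputs rather than calculated.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Optional


@dataclass
class Candidate:
    name: str
    values: List[Optional[float]]  # None marks a missing measurement
    mic: float                     # MIC vs. the fish response (made-up numbers)
    informs_restoration: bool      # rubric item 2
    dash_compatible: bool          # rubric item 3


def frac_missing(values):
    """Share of records with no measurement (rubric item 4)."""
    return sum(v is None for v in values) / len(values)


def frac_zero(values):
    """Share of observed records equal to zero (rubric item 4)."""
    observed = [v for v in values if v is not None]
    return sum(v == 0 for v in observed) / len(observed) if observed else 1.0


def pearson(x, y):
    """Pearson correlation over pairs where both values are present."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return 0.0
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def screen(candidates, max_missing=0.2, max_zero=0.5, max_abs_r=0.7):
    """Keep covariates that pass items 2-4, then add them in descending MIC
    order, skipping any that are too correlated with one already kept."""
    eligible = [
        c for c in candidates
        if c.informs_restoration
        and c.dash_compatible
        and frac_missing(c.values) <= max_missing
        and frac_zero(c.values) <= max_zero
    ]
    eligible.sort(key=lambda c: c.mic, reverse=True)
    kept = []
    for c in eligible:
        if all(abs(pearson(c.values, k.values)) <= max_abs_r for k in kept):
            kept.append(c)
    return [c.name for c in kept]


# Hypothetical candidates echoing the discharge / thalweg depth example.
candidates = [
    Candidate("discharge", [1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 0.60, False, True),
    Candidate("avg_thalweg_depth", [0.21, 0.38, 0.62, 0.79, 1.02, 1.18], 0.55, True, True),
    Candidate("pool_frequency", [0.0, 0.0, 0.0, 0.0, 1.0, 0.0], 0.40, True, True),
    Candidate("lwd_count", [3.0, None, 1.0, 4.0, 2.0, 5.0], 0.30, True, True),
]
selected = screen(candidates)
print(selected)
```

In this toy input, discharge is screened out because it is flagged as uninformative for restoration, and the highly correlated thalweg depth takes its place, mirroring the substitution described in the example.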

Briefly discuss how we compared outputs between species and settled on a joint model.

3 QRF Model fit

Talk about relative importance and pdp plots.

4 Extrapolation model

The spatial extent of QRF capacity predictions was limited to reaches with CHaMP habitat data, so capacity for all wadeable streams in the Columbia River Basin was estimated by developing an extrapolation model. This model used ‘globally available attributes’ (GAAs), obtained from a stream layer created by Morgan Bond and Tyler Nodine based on the National Hydrography Dataset High Resolution 1:24,000 line network, to estimate the capacities predicted by the QRF model at the 200-meter reach scale.
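One way to picture how reach-scale predictions become watershed totals like those tabulated in this report is to multiply each reach's predicted capacity density by its length and sum within watersheds. The sketch below assumes capacity is predicted as fish per meter of stream, which the text does not state. The reach records and the `watershed_totals` helper are hypothetical, and the values are made up.

```python
from collections import defaultdict

# Hypothetical reach records: (watershed, reach length in m, predicted fish per m).
reaches = [
    ("Lemhi", 200.0, 1.5),
    ("Lemhi", 200.0, 0.5),
    ("Pahsimeroi", 200.0, 1.0),
]


def watershed_totals(reach_records):
    """Sum reach capacity (density x length) within each watershed."""
    totals = defaultdict(float)
    for watershed, length_m, fish_per_m in reach_records:
        totals[watershed] += length_m * fish_per_m
    return dict(totals)


totals = watershed_totals(reaches)
print(totals)
```

Standard errors are deliberately left out of this sketch, because combining them across reaches requires assumptions about dependence among reaches that the text does not describe.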

4.1 Extrapolation comparison

Figure 4.1: Extrapolations of habitat capacity for Chinook salmon, by life-stage, for the eight watersheds within the Upper Salmon River Basin using the modified models.

Figure 4.2: Extrapolations of habitat capacity for steelhead, by life-stage, for the eight watersheds within the Upper Salmon River Basin using the modified models.

5 Habitat Capacity Estimates

5.0.1 Chinook Salmon

Table 5.1: Predicted Chinook salmon habitat capacity by life-stage and watershed using the modified models.

| Watershed | Juv summer capacity | Summer SE | Juv winter capacity | Winter SE | Redd capacity | Redd SE |
|---|---|---|---|---|---|---|
| EF Salmon | 1,926,623 | 226,925.7 | 138,214 | 32,880 | 402 | 21 |
| Lemhi | 786,452 | 62,659.8 | 141,515 | 15,359 | 353 | 11 |
| NF Salmon | 339,275 | 50,147.9 | 70,462 | 10,409 | 166 | 8 |
| Pahsimeroi | 265,099 | 18,409.2 | 86,999 | 9,781 | 139 | 4 |
| Panther Cr | 1,219,542 | 118,369.5 | 201,265 | 22,296 | 448 | 17 |
| Upper Salmon | 3,301,286 | 352,419.5 | 166,522 | 45,582 | 575 | 29 |
| Valley Cr | 1,902,198 | 207,362.9 | 115,517 | 32,535 | 394 | 20 |
| Yankee Fork | 2,144,056 | 274,555.8 | 119,298 | 28,783 | 438 | 23 |

5.0.2 Steelhead

Table 5.2: Predicted steelhead habitat capacity by life-stage and watershed using the modified models.

| Watershed | Juv summer capacity | Summer SE | Juv winter capacity | Winter SE | Redd capacity | Redd SE |
|---|---|---|---|---|---|---|
| EF Salmon | 252,597 | 15,520.5 | 337,682 | 36,795 | 413 | 24 |
| Lemhi | 310,577 | 9,082.3 | 363,898 | 27,441 | 441 | 18 |
| NF Salmon | 242,471 | 18,381.8 | 313,118 | 27,955 | 323 | 22 |
| Pahsimeroi | 159,705 | 6,225.1 | 205,921 | 13,951 | 198 | 8 |
| Panther Cr | 268,476 | 13,598.0 | 339,671 | 19,946 | 317 | 15 |
| Upper Salmon | 243,548 | 14,843.6 | 310,879 | 39,013 | 452 | 32 |
| Valley Cr | 176,048 | 10,707.6 | 288,579 | 31,329 | 365 | 26 |
| Yankee Fork | 197,926 | 12,378.9 | 341,310 | 38,555 | 449 | 36 |

5.1 Comparison with previous extrapolation

Below are comparisons with the results from the previous QRF model and random forest extrapolation.
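The percent-change columns in the comparison tables are presumably computed relative to the original-model estimate, although the text does not define them. Under that assumption, the arithmetic can be sketched as follows, where `percent_change` and `implied_original` are illustrative helper names and not part of any QRF code. Because the reported percentages are rounded to whole numbers, any original value backed out this way is only approximate.

```python
def percent_change(modified, original):
    """Percent change of the modified estimate relative to the original."""
    return 100.0 * (modified - original) / original


def implied_original(modified, pct_change):
    """Back out the approximate original estimate from a reported change."""
    return modified / (1.0 + pct_change / 100.0)


# Yankee Fork, Chinook juvenile summer: modified capacity and reported % change.
yankee_modified = 2_144_056.4
yankee_pct = 222
yankee_original = implied_original(yankee_modified, yankee_pct)
print(round(yankee_original))
```

A reported change of 222% therefore means the modified estimate is 3.22 times the original, not 2.22 times.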

5.1.1 Chinook

Figure 5.1: Comparison of Chinook salmon habitat capacity estimates between the modified and original model extrapolations, by life-stage, for the eight watersheds within the Upper Salmon River Basin.

Table 5.3: Estimated Chinook salmon capacities and comparison with previous random forest extrapolations for eight watersheds.

| Model | Watershed | Predicted capacity | Capacity % change | Predicted capacity SE | SE % change |
|---|---|---|---|---|---|
| Juv summer | EF Salmon | 1,926,623.4 | 112 | 226,926 | 186 |
| Juv summer | Lemhi | 786,451.7 | 112 | 62,660 | 172 |
| Juv summer | NF Salmon | 339,275.4 | 13 | 50,148 | 100 |
| Juv summer | Pahsimeroi | 265,099.3 | 45 | 18,409 | 54 |
| Juv summer | Panther Cr | 1,219,541.6 | 21 | 118,369 | 33 |
| Juv summer | Upper Salmon | 3,301,286.0 | 163 | 352,419 | 205 |
| Juv summer | Valley Cr | 1,902,197.5 | 152 | 207,363 | 191 |
| Juv summer | Yankee Fork | 2,144,056.4 | 222 | 274,556 | 284 |
| Juv winter | EF Salmon | 138,214.5 | 0 | 32,880 | 139 |
| Juv winter | Lemhi | 141,514.7 | -8 | 15,359 | 127 |
| Juv winter | NF Salmon | 70,462.3 | 28 | 10,409 | 106 |
| Juv winter | Pahsimeroi | 86,999.4 | -8 | 9,781 | 44 |
| Juv winter | Panther Cr | 201,265.5 | 29 | 22,296 | 122 |
| Juv winter | Upper Salmon | 166,521.7 | -29 | 45,582 | 87 |
| Juv winter | Valley Cr | 115,516.8 | -12 | 32,535 | 145 |
| Juv winter | Yankee Fork | 119,298.3 | 20 | 28,783 | 122 |
| Redds | EF Salmon | 401.9 | -13 | 21 | -29 |
| Redds | Lemhi | 353.0 | 5 | 11 | 19 |
| Redds | NF Salmon | 165.7 | -5 | 8 | -4 |
| Redds | Pahsimeroi | 139.4 | 25 | 4 | 17 |
| Redds | Panther Cr | 447.8 | -4 | 17 | -14 |
| Redds | Upper Salmon | 575.0 | -20 | 29 | -41 |
| Redds | Valley Cr | 393.7 | -29 | 20 | -44 |
| Redds | Yankee Fork | 438.2 | -38 | 23 | -59 |

5.1.2 Steelhead

Figure 5.2: Comparison of steelhead habitat capacity estimates between the modified and original model extrapolations, by life-stage, for the eight watersheds within the Upper Salmon River Basin.

Table 5.4: Estimated steelhead capacities and comparison with previous random forest extrapolations for eight watersheds.

| Model | Watershed | Predicted capacity | Capacity % change | Predicted capacity SE | SE % change |
|---|---|---|---|---|---|
| Juv summer | EF Salmon | 252,597.2 | -31 | 15,521 | -1 |
| Juv summer | Lemhi | 310,577.2 | -15 | 9,082 | 11 |
| Juv summer | NF Salmon | 242,471.4 | -5 | 18,382 | 34 |
| Juv summer | Pahsimeroi | 159,705.1 | -18 | 6,225 | 14 |
| Juv summer | Panther Cr | 268,475.9 | -8 | 13,598 | 42 |
| Juv summer | Upper Salmon | 243,548.0 | -31 | 14,844 | -11 |
| Juv summer | Valley Cr | 176,047.8 | -28 | 10,708 | -11 |
| Juv summer | Yankee Fork | 197,926.3 | -29 | 12,379 | 38 |
| Juv winter | EF Salmon | 337,681.9 | -14 | 36,795 | 30 |
| Juv winter | Lemhi | 363,897.6 | -8 | 27,441 | 52 |
| Juv winter | NF Salmon | 313,118.1 | -1 | 27,955 | 4 |
| Juv winter | Pahsimeroi | 205,921.2 | -4 | 13,951 | 21 |
| Juv winter | Panther Cr | 339,671.3 | 8 | 19,946 | 25 |
| Juv winter | Upper Salmon | 310,878.6 | -26 | 39,013 | 20 |
| Juv winter | Valley Cr | 288,579.2 | -14 | 31,329 | 10 |
| Juv winter | Yankee Fork | 341,309.6 | -18 | 38,555 | -7 |
| Redds | EF Salmon | 413.2 | -13 | 24 | -13 |
| Redds | Lemhi | 441.4 | 10 | 18 | 12 |
| Redds | NF Salmon | 323.4 | -10 | 22 | 22 |
| Redds | Pahsimeroi | 198.5 | 2 | 8 | -8 |
| Redds | Panther Cr | 317.0 | -7 | 15 | 15 |
| Redds | Upper Salmon | 452.1 | -11 | 32 | -6 |
| Redds | Valley Cr | 365.0 | -20 | 26 | -7 |
| Redds | Yankee Fork | 448.9 | -25 | 36 | 13 |

6 Supplemental figures and tables

6.1 Capacity by stream

6.1.1 Chinook

6.1.2 Steelhead